(57) Abstract

Biologically active peptides and nucleic acids are
identified by a method comprising the following steps:
(a) production of a pool of appropriate vectors each
containing totally or partly random DNA sequences, (b)
efficient transduction of said vectors into a number of
identical eukaryotic cells in such a way that a single
ribonucleic acid and possibly peptide is expressed or a
limited number of different random ribonucleic acids
and peptides are expressed by each cell, (c) screening of
said transduced cells to see whether some of them have
changed a certain phenotypic trait, (d) selection and
cloning of said changed cells, (e) isolation and
sequencing of the vector DNA in said phenotypically
changed cells, and (f) deducing the ribonucleic acid and
peptide sequences from the DNA sequence. The peptide
sequences may be introduced into or fused to a larger
protein, preferably an antibody molecule or a fragment
thereof. This may be obtained by introducing the random
DNA sequences into or fusing them to a DNA sequence
encoding such larger protein.

A METHOD FOR IDENTIFICATION OF BIOLOGICALLY ACTIVE
PEPTIDES AND NUCLEIC ACIDS



This invention concerns a novel method for identification
of new peptides and post-translationally modified peptides
as well as nucleic acids with biological activity.

BACKGROUND OF THE INVENTION 

During the last five years the technology for expressing,
testing and identifying millions of different random
peptide sequences has evolved dramatically. Such peptide
libraries can be used for identification of new
biologically active peptides, and therefore the technology
has added an exciting and very promising new epoch to the
field of drug development.

The known peptide library techniques can at present be
divided into two fundamentally different groups: the
random synthetic peptide libraries, in which the random
peptides are produced chemically, and the random
biosynthetic peptide libraries, in which the random
peptides are encoded by partly or totally random DNA
sequences and subsequently synthesized by ribosomes.

The synthetic peptide libraries. 

Synthetic peptide libraries containing millions of
peptides can be produced by combinatorial peptide chemistry
and may either be synthesized in soluble form (R.A.
Houghten et al., Nature, 354, 84-86, 1991) or remain
immobilized on the peptide resin beads (A. Furka et al.,
Int. J. Peptide Protein Res., 37, 487-493, 1991; K.S. Lam
et al., Nature, 354, 82-84, 1991). Using either of these
approaches different receptor ligands have been isolated.

The advantage of the soluble peptide libraries compared
to solid phase immobilized peptide libraries is that
soluble peptides may bind to the receptors in question
with less steric hindrance. In the synthetic peptide
library technique proposed by Furka et al. 1991 and Lam et
al. 1991, respectively, the most important improvement,
on the other hand, was the approach of having only one
type of peptide sequence on each bead ("one bead - one
peptide"). This enables direct selection and eventually
sequencing of the putative active peptide ligand on a
single bead using e.g. Edman degradation. Using this
technology active peptides including peptides consisting
of D-amino acids or other unnatural amino acids can be
identified (B. Gissel et al., J. Peptide Science, in
press, 1995).

The biosynthetic peptide libraries. 

Bacteriophage expression vectors have been constructed
that can display peptides on the phage surface (S.E.
Cwirla et al., Proc. Natl. Acad. Sci. USA, 87, 6378-6382,
1990). Each peptide is encoded by a randomly mutated
region of the phage genome, so sequencing of the relevant
DNA region from the bacteriophage found to bind a receptor
will reveal the amino acid sequence of the peptide
ligand. A phage containing a peptide ligand is detected
by repeated panning procedures which enrich the phage
population for a strong receptor binding phage (J.K.
Scott, TIBS, 17, 241-245, 1992).

Bacteria have also been used for expression of similar
peptide libraries. The peptides can be fused to an
exported protein, such as an antibody, which can be
immobilized on a solid support. By screening the solid
support with an appropriate soluble receptor the bacterial
clone producing the putative peptide ligand can be
identified (M.G. Cull et al., Proc. Natl. Acad. Sci. USA,
89, 1865-1869, 1992).

Another described method for preparing large pools of
different possibly active compounds is by the use of
libraries consisting of randomized ribonucleotides or
deoxyribonucleotides, the so-called aptamer libraries. The
aptamers are generated in E. coli from a plasmid vector
containing randomized DNA. From these libraries structures
with biological activity have been identified (L.C.
Bock et al., Nature, 355, 564-566, 1992).

PURPOSE OF THE INVENTION

In order to use the prior art methods for identifying
biologically active peptides and nucleic acids or their
respective cellular target proteins, it is necessary to
possess a detailed knowledge about the molecular
mechanisms involved in a certain biological process. If
these mechanisms are known, it may subsequently be
possible to develop antagonists or agonists of targets
(receptors, enzymes, etc.) involved using said methods.
The problem to be solved by the present invention is,
inter alia, to overcome the need for said detailed
knowledge.

SUMMARY OF THE INVENTION 

According to the present invention the peptide sequences
or the ribonucleic acids are identified from
biosynthetically expressed eukaryotic libraries containing
millions of partly or totally random peptides and
ribonucleic acids. In connection with this invention the
term "peptide" shall be understood to comprise also a
peptide sequence introduced into or fused to a larger
protein (e.g. an antibody). The peptides and ribonucleic
acids are synthesized by the cells from random DNA
sequences which have been effectively transduced into the
cells. Some of the peptides or ribonucleic acids in the
library will affect important biological functions in the
cells which express them. Cells which change phenotype due
to the presence of such substances can be isolated, and
the chemical structure of the substances can subsequently
be clarified by sequencing of the DNA which encodes them.
Such peptides could possibly be used therapeutically or as
lead compounds for future drug development, or they can be
used for identification of new target proteins which are
causing the change in the biological function of the cell.
Such target proteins can eventually be used in the
development of drugs by e.g. conventional medicinal
chemistry or synthetic peptide libraries.

Accordingly, the method of the invention comprises the
following steps: (a) production of a pool of appropriate
vectors each containing totally or partly random DNA
sequences, (b) efficient transduction of said vectors into
a number of identical, e.g. mammalian, cells in such a
way that a single ribonucleic acid and possibly peptide
is expressed or a limited number of different random
ribonucleic acids and peptides are expressed by each cell,
(c) screening of said transduced cells to see whether
some of them have changed a certain phenotypic trait, (d)
selection and cloning of said changed cells, (e) isolation
and sequencing of the vector DNA in said phenotypically
changed cells, and (f) deducing the RNA and peptide
sequences from the DNA sequence.

BRIEF DESCRIPTION OF THE DRAWING 

Figure 1 is a schematic drawing of a standard retroviral
peptide expression vector. The plasmid form of the vector
(top) carries a cytomegalovirus (CMV) promoter directing
expression of a retroviral RNA (middle) with a backbone
(R, U5, PBS, Ψ, PPT, U3, and R) from Akv murine leukaemia
virus and a peptide translation cassette followed by an
internal ribosomal entry site (IRES) from EMC virus
directing translation of a Neomycin resistance gene (Neo).
The vector provirus in the target cell (below) contains a
regenerated retroviral long terminal repeat - LTR (U3, R,
and U5). The CMV promoter sequence provides a unique tag
for efficient initial PCR mutagenesis and amplification.
The size of the vector without an inserted peptide
expression cassette is 4.0 kb.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

A built-in requirement of all the presently known peptide
library techniques is the necessity of a detailed
knowledge about the mechanisms by which a given receptor or
enzyme regulates a certain phenotypic trait of the cell.
This receptor or enzyme furthermore has to be available
in a relatively pure form before a ligand can be selected
in either of the two types of peptide libraries. When
potential peptide ligands eventually have been identified,
functional assays have to be performed to determine
whether the ligands exert antagonistic or agonistic
effects on the desired cellular phenotypic trait.

The method according to the present invention overcomes
this major problem. By the present method it is thus not
necessary to know the chain of mechanisms, receptors,
signalling pathways, enzymes etc. which generate the
phenomenon inside or on the surface of the cell, since it
is the resulting biological effect or phenotypic trait
which is screened for. This is achieved by the following
steps: (a) production of a pool of appropriate vectors
each containing totally or partly random DNA sequences,
(b) efficient transduction of said vectors into a number
of identical eukaryotic, e.g. mammalian, cells in such a
way that only a single ribonucleic acid and peptide
species is expressed ("one cell - one ribonucleic acid or
peptide") or optionally a limited number of different
random ribonucleic acids and peptides are expressed by
each cell, (c) screening of said transduced cells to see
whether some of these have changed a certain phenotypic
trait, (d) selection and cloning of said changed cells,
(e) isolation and sequencing of the vector DNA in said
phenotypically changed cells, and (f) deducing the RNA
and peptide sequences from the DNA sequence. Either the
RNA or the peptide encoded by the isolated DNA sequences
may be the cause of said phenotypic changes of the cell
and may therefore possess biological activity.
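
Step (f) can be illustrated by a short Python sketch. It is
not part of the method as described here; it assumes the
standard genetic code and a hypothetical expression cassette
in which translation starts at the first ATG and ends at the
first in-frame stop codon. The function names and the example
sequence are illustrative assumptions only.

```python
# Standard genetic code built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(dna):
    """Return the RNA sequence corresponding to the coding DNA strand."""
    return dna.upper().replace("T", "U")


def translate(dna):
    """Translate from the first ATG to the first in-frame stop codon.

    Assumption: the cassette uses the first ATG as start codon.
    Unknown codons (e.g. containing N) are reported as 'X'.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    peptide = []
    for pos in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if residue == "*":  # stop codon ends the peptide
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    # Hypothetical sequenced insert: flank + ATG + eight codons + stop + flank.
    insert = "GGCCATGAGTATTATCAACTTCGAAAAACTGTAAGG"
    print("RNA:    ", transcribe(insert))
    print("Peptide:", translate(insert))
```

Both outputs are retained because, as stated above, either the
RNA or the peptide may be the active species.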

The introduced peptides are expressed e.g. in the
cytoplasm of biologically interesting cells, which before
that were totally identical, using e.g. the retroviral
vector systems described in Example 1. They are expressed
from a pool of vectors containing random DNA sequences
which has been constructed e.g. as described in Example
1. Using an appropriate ratio between infective
retrovirus and non-infected cells, only a single DNA copy
derived from the pool of retrovirus vectors is introduced
into each cell. The major advantage of this "one cell -
one ribonucleic acid or peptide" concept is that cells
which have changed phenotypically upon the introduction of
peptides can be isolated by cloning and selection methods,
and that the active peptide causing the phenotypic change
can subsequently be identified. This is accomplished by
isolating the DNA fragment encoding the peptide, e.g. by
Polymerase Chain Reaction (PCR) technology, and
subsequently identifying the DNA sequence.

During the initial screening procedure a larger number of
retrovirus vectors can be introduced into each cell, which
enables the individual cell to express a number of
different ribonucleic acids or peptides. When a
phenotypically changed cell clone subsequently has been
isolated, all retroviral DNA in that particular clone can
be isolated by PCR, and the PCR product can be used for
retransfection of the packaging cells ordinarily used for
virus production. The retroviral vectors isolated from
these packaging cells can subsequently be used for
transduction of new biologically interesting cells using
the "one cell - one ribonucleic acid or peptide" concept.
Finally, after a second cloning procedure the active
substance can be identified as described above. Further,
the biologically active ribonucleic acid, peptide or
protein isolated by this method can be utilized either as
a lead compound for drug development or as an affinity
ligand with the purpose of isolating and identifying the
protein target responsible for the biological activity.
Such target proteins are very useful tools in drug
development.

The use of small vectors (below 3 kb of DNA) has the
major advantage of allowing simple PCR-mutagenesis and
amplification without ligation and cloning steps. Another
important advantage of using small vectors is that the
vectors in a pool of target cells can be amplified
directly by PCR and retransfected into packaging cells,
hence allowing multiple rounds of selection to remove
time-consuming analysis of false positives or
contaminating cells. Direct sequence analysis of the
derived random plasmid clones is used to assure the
randomness of the expression library. The standard vector
capable of expressing the random peptide library is shown
schematically in Fig. 1.

The cells which are found to have changed phenotypically
upon the introduction of the random DNA sequences could
alternatively have changed as a consequence of
interactions with the ribonucleic acid molecule
transcribed from the introduced DNA, in analogy to the
described libraries consisting of randomized
ribonucleotides or deoxyribonucleotides, the so-called
aptamer libraries. Such ribonucleic acid molecules would
therefore also possess biological activity. Furthermore,
the observed effect could also be due to biological
activity of carbohydrate moieties or other
post-translational modifications on the expressed peptides
in the cell. Using an appropriate purification tag on the
peptides in the library, it would be possible in that case
to purify these and analyze the exact chemical structure
of the post-translational modifications in question.

Since the efficiency of the non-viral methods commonly
used for stable gene transfer into mammalian cells is
very low, it would not be possible to establish a peptide
expression library in mammalian cells by such methods. In
addition, non-viral methods generally lead to multiple
integrations of DNA in the cell genome, in disagreement
with the "one cell - one ribonucleic acid or peptide"
concept. In order to achieve the necessary high efficiency
single gene copy transfer a viral vector must be used.
Very recently cDNA expression libraries which were
constructed using retroviral vectors have been described.
From such libraries cytokines and cellular growth factors
have been isolated (A.J.M. Murphy et al., Proc. Natl.
Acad. Sci. USA, 84, 8277-8281, 1987; B.Y. Wong et al., J.
Virol., 68, 5523-31, 1994; J.R. Rayner et al., Mol. Cell.
Biol., 14, 880-887, 1994). Expression of well defined
peptides in transfected eukaryotic cells has also
previously been established, although not using retroviral
vectors (M.S. Malnati et al., Nature, 357, 702-704, 1992;
E.O. Long et al., J. Immunol., 153, 1487-1494, 1994). A
library of random peptides has never been expressed in
mammalian cells with the purpose of identifying
biologically active peptides or ribonucleic acids.

Immunology is an important biological field where the
method according to the invention can be used. T cells
only recognize fragments of protein antigens, and only if
these are bound to MHC molecules. Two types of MHC
molecules - the class I and II molecules - present antigen
fragments to T cells. The peptides presented by MHC class
I molecules, which are on the surface of essentially all
nucleated cells, are 8-9 amino acids long and generally
derived from proteins in the cytosol of the cell. These
can be self-proteins, viral proteins, peptides introduced
into the cytosol by transfection or tumor antigens. It is
of considerable interest to be able to identify such
peptides or T cell epitopes, e.g. with regard to vaccine
development or in immunotherapy of cancer. Identification
of such fragments is, however, a very difficult, demanding
and sometimes impossible task requiring large
amounts of affinity purified MHC molecules derived from
the tumor cell in question and iterative combinations of
advanced mass spectrometry, HPLC and functional T cell
assays. Furthermore, most peptide antigens cannot be
identified due to the presence of very low amounts of the
individual peptides on the MHC molecules. In Example 2 it
is demonstrated that the method according to the present
invention can be used for identification of said T cell
epitopes.




Cell lines expressing biologically important surface
molecules can also be transduced with a random peptide
library according to the invention. An example of such
molecules could be the B7 co-stimulatory molecule, which
is known to be important for activation of T cells, or
the selectin family of proteins, which are known to be
involved in the homing of inflammatory cells to inflamed
tissue. Cells which change phenotype (e.g. either up- or
down-regulate B7) can be selected and cloned as described
in Example 3. After isolation of the transduced DNA by
PCR new cells can be transduced with the isolated DNA to
confirm the observation. Subsequently, the peptide
sequence can be deduced from the DNA sequence.

It has been described by others that specific monoclonal
antibodies or F(ab) fragments can be expressed in the
cytoplasm of a cell and exert a biological activity there
(T.M. Werge et al., FEBS Letters, 274, 193-198, 1990).

According to the present invention the wholly or partly
random peptide sequence can also be introduced into the
variable region of an antibody F(ab) fragment. Therefore,
a library of F(ab) fragments containing random peptide
sequences can be expressed in a cell clone in such a way
that each cell expresses a single antibody specificity.
Subsequently biologically changed cells are isolated and
cloned, and the identified intracellular F(ab) fragment
can be used for purification of the target protein
involved in the biological process. Such target proteins
can subsequently be used for development of drugs capable
of modifying said target proteins.

The invention is illustrated by the following examples:

EXAMPLE 1

Construction of an intracellular eukaryotic peptide
library

A retrovirus vector capable of expressing random peptide
sequences is constructed. If the random DNA sequences
used in the vector were produced using conventional
random oligonucleotide synthesis a large number of stop
codons inevitably would be introduced. Furthermore, due
to the degeneracy of the genetic code an uneven
distribution of the encoded amino acids would be the
result. We avoid this by producing the random DNA
sequences by random codon synthesis: An appropriate amount
of resin used for solid phase oligonucleotide synthesis
(optionally already containing a DNA sequence
corresponding to an appropriate vector cloning site) is
divided into 20 different portions. On portion no. 1 a
conventional solid phase chemical synthesis of three bases
corresponding to a codon encoding the amino acid alanine
is performed. On portion no. 2 a codon encoding cysteine
is synthesized and so forth. When each of the 20 portions
has been coupled with a codon corresponding to one of the
natural amino acids, all portions are mixed and divided
again into 20 equally sized resin portions. The codon
synthesis is then repeated, and the whole procedure is
repeated until the desired randomized DNA sequence has
been synthesized, e.g. corresponding to 6-10 random amino
acids. This can also be achieved by using blocked and
protected trinucleotide phosphoramidites encoding the 20
natural amino acids in a total random oligonucleotide
synthesis (J. Sondek et al., Proc. Natl. Acad. Sci. USA,
89, 3581-5, 1992). Finally, other appropriate vector
cloning sites can be synthesized on all the
oligonucleotides before they eventually are cleaved from
the resin.
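
The divide, couple and mix cycle described above can be
simulated in a few lines of Python. This is only a sketch
under stated assumptions: each resin bead is modelled as a
string, one arbitrarily chosen codon is used per amino acid
(the specification does not name the codons), and the strings
are built in reading order, ignoring the chemical direction
of solid phase synthesis. For comparison, the last function
gives the fraction of fully random (NNN) inserts that would
be free of stop codons, using the fact that 3 of the 64
codons of the standard genetic code are stop codons.

```python
import random

# One illustrative codon per natural amino acid (an assumption,
# not taken from the specification). None of them is a stop codon.
CODONS = {
    "A": "GCT", "C": "TGC", "D": "GAC", "E": "GAA", "F": "TTC",
    "G": "GGT", "H": "CAC", "I": "ATC", "K": "AAA", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCG", "Q": "CAG", "R": "CGT",
    "S": "AGC", "T": "ACC", "V": "GTT", "W": "TGG", "Y": "TAC",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_and_mix(n_beads, n_cycles, rng):
    """Simulate random codon synthesis on n_beads resin beads."""
    codons = list(CODONS.values())
    beads = [""] * n_beads
    for _ in range(n_cycles):
        rng.shuffle(beads)  # mix all portions
        # Deal the beads into 20 equally sized portions; portion k
        # is coupled with codon k.
        beads = [seq + codons[i % 20] for i, seq in enumerate(beads)]
    return beads


def stop_free_fraction_nnn(n_codons):
    """Fraction of fully random NNN inserts without any stop codon."""
    return (61 / 64) ** n_codons


if __name__ == "__main__":
    library = split_and_mix(n_beads=2000, n_cycles=8, rng=random.Random(0))
    with_stop = sum(
        any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq), 3))
        for seq in library
    )
    print("inserts containing a stop codon:", with_stop)
    print("stop-free fraction with NNN synthesis:",
          round(stop_free_fraction_nnn(8), 3))
```

Because every portion receives exactly one codon per cycle,
each amino acid is represented equally at every position,
which is the even distribution the procedure aims for.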

The pool of random synthetic oligonucleotides can be used
to generate a pool of vectors with random sequences in
appropriate positions either by restriction cleavage and
ligation or, preferentially, to avoid inefficient
ligation steps, by PCR-mutagenesis. The procedures follow
the principles of site-directed PCR-mediated mutagenesis
(S. Perrin et al., Nucl. Acids Res., 18, 7433-38, 1990),
but the methodology has been adapted to deliver a complex
mixture of products. Briefly, the random oligonucleotides
carrying vector sequences flanking the random sequence
are used as primers in a PCR reaction together with a
unique terminal vector primer. In order to retain
complexity large quantities of template as well as of
vector and randomized primers are used, and product
diversity is further ensured by pooling of multiple
independent PCR reactions. Subsequently, an overlapping
PCR fragment containing the remaining vector segment is
produced by standard PCR. Finally, this overlapping
segment is joined with the PCR fragments containing the
random DNA using another unique set of terminal primers.

DNA fragments produced by Taq DNA polymerase enzyme may
contain additional nucleotides at the 3' end of the DNA
strand. These extra nucleotides will be deleterious for
combining two PCR products with overlapping termini into
one fragment. By addition of Klenow DNA polymerase enzyme
these nucleotides can be removed by the 3' to 5'
exonuclease activity of said enzyme, increasing the
combining efficiency.
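
Why an extra 3' nucleotide interferes with the joining of two
overlapping fragments can be shown with a purely string-level
toy model in Python. It is an assumption-laden sketch, not a
model of the enzymology: only one strand is represented, the
extra nucleotide is taken to be a single A, and the sequences
and function names are hypothetical.

```python
def longest_overlap(left, right, min_overlap=8):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def join_fragments(left, right, min_overlap=8):
    """Merge two fragments through their overlapping termini, or return None."""
    size = longest_overlap(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


def trim_3prime_extra(fragment, n_extra=1):
    """Remove n_extra bases from the 3' end (stand-in for exonuclease trimming)."""
    return fragment[:-n_extra] if n_extra else fragment


if __name__ == "__main__":
    overlap = "CTCGAGGTCGACGGTACC"             # hypothetical shared vector segment
    random_part = "ATGGCTTGCGACGAA" + overlap   # fragment carrying the random DNA
    vector_part = overlap + "TTTAAACCCGGG"      # fragment with the remaining vector

    untrimmed = random_part + "A"               # extra 3' nucleotide
    print("untrimmed:", join_fragments(untrimmed, vector_part))
    print("trimmed:  ", join_fragments(trim_3prime_extra(untrimmed), vector_part))
```

In this model the untrimmed fragment no longer ends in the
shared segment, so no join is found until the extra base is
removed.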

Alternatively the PCR product can be trimmed at the
termini by digestion with restriction enzymes whose
recognition sequences have been incorporated into the
oligonucleotides used as primers for the PCR reaction. By
utilization of the temperature cycle method developed by
Lund et al. (Nucleic Acids Res., 24, 800-801, 1996) the
efficiency of the ligation reaction can be increased and
the ligation product used for direct transfection of the
packaging cells.

To maintain diversity, the PCR-generated linear vector
DNA is used directly for transfection of packaging cells,
and virions containing the pool of different vectors are
harvested under transient conditions. Small bicistronic
single transcript vectors are used, containing a random
peptide translation cassette followed by an internal
ribosomal entry site (IRES) from EMC virus directing
translation of a Neomycin (Neo) resistance gene. The Neo
gene functions as a selection marker to allow titre
determination and elimination of non-transduced cells, if
necessary. Other available relevant vectors employ other
selectable genes such as phleomycin and hygromycin B
resistance and have the peptide translation cassette after
the IRES element. Alternatively, even smaller vectors,
carrying only the peptide expression cassette and lacking
a selection marker, can also be used.

The use of small vectors (below 4 kb of DNA) has the
major advantage of allowing simple PCR-mutagenesis and
amplification without ligation and cloning steps. Another
important advantage of using small vectors is that the
vectors in a pool of target cells can be amplified
directly by PCR and retransfected into packaging cells,
hence allowing multiple rounds of selection to remove
time-consuming analysis of false positives or
contaminating cells. Direct sequence analysis of the
derived random plasmid clones is used to assure the
randomness of the expression library. The standard vector
capable of expressing the random peptide library is shown
schematically in Fig. 1.

To maintain diversity of the vector pool a high fraction
of the vector RNA transcripts must be encapsidated into
retroviral particles. By transfection of the packaging
cells with a DNA construct expressing a tRNA matching the
corresponding PBS in the retroviral vector we can
increase the production of functional vector-containing
virus particles 10-fold under transient conditions.

A new packaging cell line will be generated after a
single transfection of a construct encoding all retroviral
proteins. To diminish the risk of generation of
replication competent virus and to obtain maximal
expression the vector will have the following simplified
outline: promoter-gag-pol transcript-IRES-phleomycin
resistance gene-IRES-env-polyadenylation signal. One
advantage of said vector is that the phleomycin resistance
gene enables selection for high expression and that the
sheer size of the vector transcript restricts the
encapsidation into retroviral particles, thereby
diminishing the risk of generation of replication
competent virus. The size limit for encapsidation of RNA
transcripts is about 10 kb.

In addition to a traditional packaging cell line, a
semi-packaging cell line with a corresponding
minivirus-vector will be used. The semi-packaging cell
line contains vectors encoding two mutated gag-pol
transcripts complementing each other. The use of two
different gag-pol transcripts reduces the risk of
generating wild type virus. Each cell in the
semi-packaging cell line now contains all retroviral
proteins needed for production of retroviral particles
except the envelope proteins; these proteins are supplied
by the minivirus-vector. This vector is a bicistronic
vector with the following outline: LTR-PBS-packaging
signal-random peptide-IRES-env-polypurine tract-LTR. This
vector will be able to transduce the semi-packaging cells
as these do not produce envelope proteins prior to
infection with the minivirus-vector. Thus, infection of
the semi-packaging cell will not be restricted by
receptor interference.

Restrictions upon the "randomness" of the peptide
sequence can be introduced, for the purpose of limiting the
number of available sequences in the mammalian cellular
library and for introduction of e.g. N-glycosylation
sites, or other post-translational modifications of all
expressed peptides if so desired. Purification tags,
e.g. poly-His or others, can also be included in the
expressed peptides for facilitating the purification and
identification of the peptide itself. This is necessary
for the identification of post-translational
modifications which are not obvious from the peptide
sequence.
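
How such restrictions reduce the number of available
sequences can be made concrete with a small Python sketch.
The template below is a hypothetical example, not one given
in this specification: an eight-residue insert carrying the
well-known N-glycosylation sequon Asn-X-Ser/Thr (X being any
residue except proline), followed by a six-histidine tag. All
names in the code are illustrative assumptions.

```python
from math import prod
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 natural amino acids
ANY = AMINO_ACIDS
NOT_PRO = AMINO_ACIDS.replace("P", "")

# Each entry lists the residues allowed at one position.
# Hypothetical design: X X N X(not P) S/T X X X + HHHHHH
SEQUON_TEMPLATE = [ANY, ANY, "N", NOT_PRO, "ST", ANY, ANY, ANY] + ["H"] * 6


def library_size(template):
    """Number of distinct peptides a position-wise template can encode."""
    return prod(len(allowed) for allowed in template)


def sample_peptide(template, rng):
    """Draw one random member of the restricted library."""
    return "".join(rng.choice(allowed) for allowed in template)


def has_sequon(peptide):
    """True if the peptide contains an N-X-S/T motif with X not proline."""
    return any(
        peptide[i] == "N" and peptide[i + 1] != "P" and peptide[i + 2] in "ST"
        for i in range(len(peptide) - 2)
    )


if __name__ == "__main__":
    rng = random.Random(1)
    print("unrestricted 8-mer library:", library_size([ANY] * 8))
    print("restricted library:        ", library_size(SEQUON_TEMPLATE))
    print("example member:            ", sample_peptide(SEQUON_TEMPLATE, rng))
```

Fixing three of the eight positions in this way reduces the
library from 20^8 to 20^5 x 19 x 2 members while guaranteeing
that every expressed peptide carries the motif and the tag.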

A population of eukaryotic cells is infected with the
retrovirus carrying genetic constructs containing random
DNA sequences which encode a library of millions of
random peptides. Initially an excess number of virus
compared to eukaryotic cells can be used. This leads to
expression of a number of different peptides within each
eukaryotic cell. These cells can subsequently be screened
as described below, and the DNA can be isolated e.g. by
PCR and used for reinfection of other cells. If an
appropriate ratio between the number of retrovirus
containing the random DNA sequences and cells is chosen,
each cell will be transduced with a different random DNA
sequence ("one cell - one ribonucleic acid or peptide").
This eventually enables the identification of an active
peptide.
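
The effect of the virus-to-cell ratio can be estimated with a
simple model. The Python sketch below assumes, as is commonly
done but is not stated in this specification, that the number
of vector copies per cell follows a Poisson distribution
whose mean is the ratio of infective virus to cells (the
multiplicity of infection, here called moi). The function
names and the ratios tried are illustrative.

```python
from math import exp, factorial


def copies_per_cell_probability(moi, copies):
    """Poisson probability that a cell receives exactly `copies` vectors."""
    return exp(-moi) * moi ** copies / factorial(copies)


def single_copy_fraction_of_transduced(moi):
    """Among transduced cells, the fraction carrying exactly one vector."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    transduced = 1.0 - copies_per_cell_probability(moi, 0)
    return copies_per_cell_probability(moi, 1) / transduced


if __name__ == "__main__":
    for moi in (0.05, 0.1, 0.5, 1.0, 3.0):
        transduced = 1.0 - copies_per_cell_probability(moi, 0)
        single = single_copy_fraction_of_transduced(moi)
        print(f"ratio {moi:>4}: {transduced:6.1%} of cells transduced, "
              f"{single:6.1%} of those with a single copy")
```

Under this assumption a low ratio approaches the "one cell -
one ribonucleic acid or peptide" situation at the cost of
leaving most cells non-transduced, whereas an excess of virus
gives several different vectors per cell, as used in the
initial screening round.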

The peptide may optionally be targeted to different
compartments in the cell by incorporating appropriate
signal sequences in the translated sequences.

EXAMPLE 2

Identification of T cell epitopes by the use of mammalian
intracellular expression libraries

The interleukin 2 (11-2) dependent cell line, CTLL-2, is 
transfected with the murine Major Histocompatibility 
(MHC) class I molecule, K b . Subsequently this cell line 
is infected with a retrovirus peptide library which was 

10 described in Example 1. This CTLL-2 peptide library ex- 
presses a wide range of random peptides, and if appropri- 
ate K b associated anchor residues known to be important 
for peptide binding to K b are introduced in the retrovi- 
ral peptide sequence, a large library of peptides bound 

15 to K b are presented on the surface of the CTLL-2 cells, 

Kb-restricted T cell hybridomas are generated against an
appropriate virus antigen using conventional cellular
immunological technology (Current Protocols in Immunology,
Eds. Coligan et al., NIH). Such hybridomas secrete IL-2
upon recognition of antigen. A T cell hybridoma which
recognizes an unknown Kb-bound virus T cell epitope is
subsequently incubated with samples of the Kb CTLL-2
library. If the hybridoma recognizes a peptide, the CTLL-2
cell presenting the peptide will be stimulated to
proliferate by the IL-2 secreted by the hybridoma. In that
way the CTLL-2 cell expressing the unknown virus T cell
epitope can be selected and cloned. From this clone the DNA
sequence encoding the peptide epitope in question can be
isolated using PCR technology followed by conventional
DNA sequencing. This eventually leads to identification
of the unknown virus T cell epitope.
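
The final step, deducing the peptide from the sequenced insert, can
be sketched as a translation with the standard genetic code. The
function name translate and the example insert are illustrative
choices for this sketch and are not taken from the description; the
sketch also assumes that the insert is already in frame and contains
only the bases A, C, G and T.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence up to the first stop codon.

    Raises KeyError if a codon contains a base other than A, C, G or T.
    """
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the peptide
            break
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    # Illustrative insert: start codon, eight codons, stop codon.
    insert = "ATGAGTATAATCAACTTTGAAAAACTGTAA"
    print(translate(insert))
```

In practice the reading frame is fixed by the known vector sequence
flanking the random insert, so only the insert between the
retrovirus-specific primers needs to be translated.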

EXAMPLE 3 

Identification of biologically active peptides or
ribonucleic acids which regulate cell surface expression
of proteins

Cells expressing the immunoregulatory membrane molecule,
B7, are infected with the random peptide libraries
constructed as described in Example 1. In this example a
random eight-mer peptide library is introduced.
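
The size of the sequence space behind a random eight-mer library
can be estimated with simple arithmetic. The sketch below assumes
20 amino acids per position and, for the DNA level, fully random
codons of which 61 out of 64 encode an amino acid in the standard
genetic code; the library size of ten million clones is a
hypothetical figure used only for illustration.

```python
AMINO_ACIDS = 20   # standard amino acids per randomized position
SENSE_CODONS = 61  # codons encoding an amino acid (standard code)
ALL_CODONS = 64    # all possible triplets


def peptide_diversity(length: int) -> int:
    """Number of distinct peptides of the given length."""
    return AMINO_ACIDS ** length


def fraction_without_stop(length: int) -> float:
    """Fraction of fully random inserts with no stop codon in `length` codons."""
    return (SENSE_CODONS / ALL_CODONS) ** length


def sampled_fraction(library_clones: int, length: int) -> float:
    """Upper bound on the share of the peptide space covered by a library."""
    return min(1.0, library_clones / peptide_diversity(length))


if __name__ == "__main__":
    length = 8
    hypothetical_clones = 10_000_000  # assumption, not from the source
    print("possible eight-mers:", peptide_diversity(length))
    print("inserts free of stop codons:", round(fraction_without_stop(length), 3))
    print("share of space sampled:", sampled_fraction(hypothetical_clones, length))
```

Under these assumptions a library of millions of clones samples only
a small part of the possible eight-mers, which is one reason why
restrictions on the randomness, such as fixed anchor residues, can
be useful.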

Using specific monoclonal antibodies, the expression of B7
is analyzed by conventional methods. Cells which
up- or down-regulate B7 can be selected either positively,
e.g. using Fluorescence Activated Cell Sorting, or
negatively, e.g. using appropriate antibodies in
combination with lysis by complement.

Cells which show changes in expression of B7 are cloned
by conventional means, and the DNA introduced by
retroviral vectors is isolated using PCR and a set of
retrovirus-specific primers. The peptide sequence or
possibly the RNA corresponding to said DNA may be able to
modify the expression of B7 and hence the activation of
T cells. This can subsequently be tested in conventional
T cell assays.

EXAMPLE 4 

Identification of a F(ab) fragment capable of modifying
the immunoregulatory molecule B7

A retroviral library encoding the variable heavy chain
(VH) as well as the variable light chain (VL) gene
fragments of the immunoglobulin molecule is produced. The
region of both gene fragments corresponding to the antigen
binding site of the resulting F(ab) fragments furthermore
contains partly random gene sequences as described in
Example 1. This will lead to a large number of diverse
peptide sequences in the antigen binding site of the
F(ab) fragment. The retroviral vector library therefore
encodes a large number of different F(ab) fragments with
different antigen binding specificities.

This library is subsequently transduced into a cell clone
in such a way that each cell expresses a single F(ab)
fragment species in e.g. the cytoplasm. Phenotypically
changed cells are subsequently cloned and the sequence
of the peptide in the intracellular F(ab) fragment is
identified as described in Example 1. This antibody can
subsequently be produced on a large scale by conventional
means and be used for affinity purification of the
cellular target protein responsible for the biological
change of the cell phenotype. This can e.g. be done from
lysates produced from the original non-modified cell clone.

Alternatively the retroviral F(ab) library can be
constructed using e.g. a poly-His tag or other appropriate
tags. In that way the antibody and the corresponding
target can be isolated directly from the phenotypically
changed cell by affinity chromatography. The isolated
target can subsequently be identified by e.g. N-terminal
amino acid sequencing in combination with conventional
cloning methodology. Such target proteins are very
important drug targets for further drug discovery.

PATENT CLAIMS

1. A method for identification of biologically active
nucleic acids or peptides or their cellular ligands,
which comprises the steps of (a) production of a pool of
appropriate vectors each containing a DNA sequence to be
examined, (b) efficient transduction of said vectors into
a number of identical eukaryotic cells in such a way that
a single ribonucleic acid and possibly peptide is
expressed or a limited number of different ribonucleic
acids and peptides are expressed by each cell, (c)
screening of said transduced cells to see whether some of
them have changed a certain phenotypic trait, and (d)
selection and cloning of said changed cells, characterized
in that the pool of appropriate vectors in step (a)
contains totally or partly random DNA sequences selected
from the group consisting of:

i) synthetic totally random DNA sequences;

ii) synthetic random DNA sequences, in which restrictions
upon the randomness may be introduced for the purpose
of limiting the number of available sequences and/or
for the introduction of post-translational
modifications of expressed peptides;

iii) synthetic random DNA sequences like (i) or (ii)
coupled to coding sequences of purification tags in
order to facilitate the purification and
identification of expressed peptides; and

iv) synthetic random DNA sequences like (i), (ii) or
(iii) coupled to the coding sequence of a protein;

and that either

(e) the vector DNA in the phenotypically changed cells is
isolated and sequenced, and the sequences of the
biologically active ribonucleic acids or peptides are
deduced from the sequenced vector DNA;

or

(f) the biologically active ribonucleic acids or peptides
expressed in the phenotypically changed cells are used
directly for isolation of a ligand molecule to said
ribonucleic acid or peptide.

2. A method according to claim 1, in which the peptide
is a peptide sequence introduced into or fused to a
protein, preferably a F(ab) fragment or an antibody
molecule.

3. A method according to claim 1 or 2, in which the
amino acid sequences of the random peptide library are
encoded by synthetic DNA sequences/oligonucleotides
produced by codon split synthesis, where defined DNA codons
are synthesized in a random order.

4. A method according to claim 1 or 2, in which the
amino acid sequences of the random peptide library are
encoded by synthetic DNA sequences/oligonucleotides
produced by conventional random oligonucleotide synthesis.

5. A method according to any one of claims 1-4, in which
the random DNA sequences are introduced into the
expression vector by the principle of site directed
PCR-mediated mutagenesis, thereby ensuring the complexity
of the library.

6. A method according to claim 5, in which 3'-5'
exonuclease trimming of PCR product 3' ends is used for
optimal combining efficiencies of two such PCR products.

7. A method according to any one of claims 1-6, in which
temperature-cycling ligation is used for optimal ligation
of a DNA fragment into a vector, maintaining a high
diversity of the library for transfection into packaging
cells.

8. A method according to any one of claims 1-7, in which
the random DNA sequences are introduced into the number
of eukaryotic cells in such a way that only one DNA
sequence is introduced in each cell, one cell expressing
one ribonucleic acid and possibly one peptide, thus
enabling a particular sequence to be isolated and analyzed.

9. A method according to any one of claims 1-8, in which
the random DNA sequences are introduced into the
eukaryotic cells by the use of appropriate viral vectors
selected from e.g. retrovirus or vaccinia virus.

10. A method according to claim 9, in which the vector 
used is a retroviral vector. 

11. A method according to claim 10, in which the
retroviral vector has heterologous ends to facilitate
PCR-based generation of the random DNA sequences.

12. A method according to claim 11, in which the
heterologous ends contain two different promoters.

13. A method according to any one of claims 10-12, in
which the retroviral vector contains a CMV promoter
replacing the viral promoter in the 5'-LTR.

14. A method according to any one of claims 9-13, in
which the random DNA sequences are produced as linear PCR
products which are directly introduced into the virus
packaging cells by non-viral transfection methods.

15. A method according to any one of claims 9-14, in
which the viral DNA introduced into the cells is
amplified directly by PCR and used for retransfection of
new target cells with the purpose of eliminating false
positives and/or enabling the "one cell - one ribonucleic
acid or peptide" concept.

16. A method according to any one of claims 9-15, in
which the viral titer of retroviral packaging cell lines
is increased by transient transfection with a functional
tRNA gene corresponding to the PBS in the vector.

17. A method according to any one of claims 9-16, in
which a packaging cell line constructed from a vector
expressing a single transcript translating the three
polyproteins/proteins, gag-pol, a drug resistance gene,
and the env gene, is used.

18. A method according to any one of claims 9-17, in
which a semi-packaging cell line with a corresponding
minivirus/vector enabling vector expression after
transduction rather than transfection of cells is used.

19. A method according to any one of claims 1-18, in
which appropriate restrictions upon the random nature of
the expressed peptides are introduced, such as e.g.
glycosylation sites and anchor residues.

20. A method according to any one of claims 1-19, in
which the biologically active peptide or protein also
contains a purification tag enabling the direct isolation
of the biologically active protein as well as the target
protein causing the biological activity.

21. A method according to any one of claims 1-20, in
which appropriate signal peptides, other leader molecules
or recognition sequences also are encoded by the
introduced DNA in such a way that they are fused to the
expressed random peptides, or the expressed proteins
containing the random peptide sequences, enabling these to
be directed towards defined cellular compartments.

22. A method according to any one of claims 1-21, in
which the random DNA sequences are introduced into, or
fused to, a DNA sequence encoding a protein expressed
simultaneously from the library vectors.

23. A method according to claim 22, in which the protein
is selected from the group consisting of secreted
proteins, intracellular proteins, and membrane proteins,
e.g. signal transducing molecules.

24. A method according to claim 22 or 23, in which the
protein is derived wholly or partly from the heavy and/or
light chain of an antibody molecule.

25. A method according to any one of claims 1-24, which 
is used for identification of T cell epitopes. 

26. A method according to any one of claims 1-24, which
is used for identifying biologically active peptides
which regulate cell surface expression of proteins.

27. Use of a ribonucleic acid or peptide identified by
the method according to any one of claims 1-26 as a lead
compound for drug development.

28. Use of a ribonucleic acid or peptide identified by
the method according to any one of claims 1-26 for
isolation of a cellular ligand interacting with said
ribonucleic acid or peptide.

29. Use of a protein containing a particular amino acid
sequence identified by the method according to any one of
claims 1-24 for isolation of a cellular ligand
interacting with said particular amino acid sequence
contained in said protein.


